
Fix ssr #559

Merged: 5 commits merged into master from fix-ssr on Oct 16, 2024

Conversation

nehalsinghmangat
Collaborator

Fixes #532. The original SSR implementation only used an l0 penalty, which defaulted to 0, meaning the full non-sparse model was always chosen. The SSR paper, however, does model selection based on the "inflection point" of the error history. We implemented the function _ind_inflection, which takes the error history of the SSR iterations as input and returns the index at which the inflection occurs, using the metric for inflection described in the SSR paper. This model selection based on the error-history inflection is the new default when the l0 penalty is 0. The API does not change, but the different results can be seen in the unit test test_ssr_history_selection.
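
For illustration, here is a minimal standalone sketch of this kind of selection. The jump-based rule below is an assumption for the sake of the example; the actual _ind_inflection follows the metric from the SSR paper and may differ in detail.

```python
import numpy as np


def ind_inflection(err_history):
    """Sketch of an inflection-index selection (not the exact _ind_inflection).

    ``err_history`` holds one training error per SSR iteration, ordered from
    the densest model to the sparsest.  The idea is to keep pruning terms
    until the error jumps sharply, and to return the index of the last model
    before that jump.
    """
    err = np.asarray(err_history, dtype=float)
    # Relative increase in error caused by each pruning step.
    rel_jump = np.diff(err) / np.maximum(err[:-1], np.finfo(float).eps)
    # Index of the model immediately before the largest relative jump.
    return int(np.argmax(rel_jump))


# Example: errors stay flat while redundant terms are removed, then spike.
print(ind_inflection([0.010, 0.011, 0.012, 0.35, 0.9]))  # -> 2
```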

Index alignment between the coefficient and error histories was thrown off because
BaseOptimizer initializes history_ with an unregularized regression, used as an
initial guess in iterative nonconvex regressions. However, SSR does not require
an initial guess.

This commit removes the initial guess from SSR's history_ and adds a test.
Previously, SSR had simply been finding the 0-sparsity model: instead of the
algorithm in the paper, it was executing a Bayes Information Criterion-like
model selection, but with a very low l0 penalty.
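
A toy illustration of why the alignment matters (the values here are made up): if history_ carried an extra leading entry that the error history did not, the index chosen from the error history would point at the wrong set of coefficients.

```python
import numpy as np

# Hypothetical histories: one coefficient vector and one error per SSR step.
coef_history = [
    np.array([1.0, 0.5, 0.1]),  # iteration 0 (densest model)
    np.array([1.0, 0.5, 0.0]),  # iteration 1
    np.array([1.0, 0.0, 0.0]),  # iteration 2 (sparsest model)
]
err_history = [0.010, 0.012, 0.35]

ind = 1                    # index picked by the inflection criterion
coef = coef_history[ind]   # correct only because the two lists are aligned

# With a stray initial guess prepended to coef_history, coef_history[ind]
# would instead return the iteration-0 model, off by one from err_history.
```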

This adds a test asserting the behavior reported in #532.
Closes #532

This commit implements the inflection point criterion from the SSR paper,
although it still measures training loss rather than cross-validation loss.
It also changes the test to use a better-conditioned feature library, since the
l0 penalty is based upon the condition number.
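
A rough sketch of that coupling (the scale factor and exact form below are assumptions, not the values used in pysindy): a default l0 penalty proportional to the condition number of the feature library blows up when the library is ill conditioned, which skews BIC-style selection toward overly sparse models.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=(100, 5))  # well-conditioned feature library
# Nearly duplicated column -> ill-conditioned library.
theta_bad = np.column_stack([theta, theta[:, 0] + 1e-8 * rng.normal(size=100)])

# Hypothetical default: penalty scales with the library's condition number.
for lib in (theta, theta_bad):
    l0_penalty = 1e-3 * np.linalg.cond(lib)
    print(l0_penalty)
```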

Also, fix the use of inflection point model selection: because the error history is
reversed before calculating the inflection index, the index counts from the end of the list.
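
A small hypothetical example of that mapping: an index computed on the reversed error history has to be converted back before indexing the original (unreversed) histories.

```python
err_history = [0.35, 0.012, 0.010]       # as stored, densest model last (say)
reversed_err = err_history[::-1]         # ordering expected by the criterion

ind_from_end = 1                         # index found on the reversed list
ind = len(err_history) - 1 - ind_from_end  # map back to the original ordering
assert err_history[ind] == reversed_err[ind_from_end]
```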

Also, extract the calculation of the inflection index in SSR (plus tests),
and hack SSR through the tests that provide single-feature data, in the same
way that trapping was handled.

codecov bot commented Sep 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.31%. Comparing base (23da590) to head (18bac4f).
Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #559      +/-   ##
==========================================
+ Coverage   95.30%   95.31%   +0.01%     
==========================================
  Files          37       37              
  Lines        4046     4059      +13     
==========================================
+ Hits         3856     3869      +13     
  Misses        190      190              


@Jacob-Stevens-Haas
Member

@himkwtn @yb6599 @MPeng5, if you're interested in doing a review, this is a fix to the SSR optimizer that Nehal and I were working on.

No need to review; it's just an opportunity if you're interested. I'll merge it in a week if nobody picks up the review.

@Jacob-Stevens-Haas Jacob-Stevens-Haas merged commit 4d27145 into master Oct 16, 2024
8 checks passed
@Jacob-Stevens-Haas Jacob-Stevens-Haas deleted the fix-ssr branch October 16, 2024 18:48
Labels: None yet
Projects: None yet
Development: Successfully merging this pull request may close these issues:
[BUG] SSR optimizer always picks the first model it generates
2 participants